#54: Simplify index template to match production (#55)

Merged: grdumas merged 1 commit into main from fix/54-simplify-index-template on Jun 17, 2026

Conversation

grdumas (Collaborator) commented Jun 17, 2026

Summary

Simplifies the OpenSearch index template to match the production configuration by removing 7 problematic dynamic_templates that caused mapper_parsing_exception errors when indexing documents with scalar values at object-mapped paths.

Problem

The local template defined 11 dynamic_templates with explicit object mappings that conflicted with actual data structures. Production uses only 2 dynamic_templates with a flexible schema approach.

Changes

  • Removed 7 problematic dynamic_templates:

    • run_objects (results.runs.*)
    • timeseries_objects (results.runs.*.timeseries.*)
    • numa_nodes (system_under_test.hardware.numa.*)
    • storage_devices (system_under_test.hardware.storage.*)
    • network_interfaces (system_under_test.hardware.network.*)
    • cpu_flags (system_under_test.hardware.cpu.flags.*)
    • validation_threads (results.runs.*.validation.threads)
  • Added strings_as_keyword dynamic_template:

    • Maps all string fields as keyword type
    • Sets ignore_above: 1024 to prevent indexing overly long strings
    • Matches production behavior
  • Template now has only 2 dynamic_templates (matches production):

    1. test_config_parameters_as_keyword - Maps test_configuration.parameters.* as keyword
    2. strings_as_keyword - Maps all strings as keyword with ignore_above
  • Verified settings match production:

    • dynamic: true (allows flexible schema)
    • storage.enabled: false (keeps storage data in _source without indexing its subfields, which prevents mapping conflicts)
    • total_fields.limit: 5000 (matches production)
  • Static mappings preserved:

    • All 6 top-level properties intact (metadata, test, system_under_test, test_configuration, results, runtime_info)
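
As a rough sketch of the resulting shape, the template described above could look like the following, written as a Python dict and serialized to JSON. The composable-template layout (`index_patterns` plus a `template` body), the `benchmarks-*` pattern, the dotted form of the settings key, and the bare `{"type": "object"}` placeholders are assumptions; the actual file in `src/chronicler/config/` may differ.

```python
import json

index_template = {
    "index_patterns": ["benchmarks-*"],  # hypothetical pattern, not taken from the PR
    "template": {
        "settings": {"index.mapping.total_fields.limit": 5000},
        "mappings": {
            "dynamic": True,
            "dynamic_templates": [
                {
                    "test_config_parameters_as_keyword": {
                        "path_match": "test_configuration.parameters.*",
                        "mapping": {"type": "keyword"},
                    }
                },
                {
                    "strings_as_keyword": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword", "ignore_above": 1024},
                    }
                },
            ],
            # The six static top-level properties; their real sub-mappings are omitted.
            "properties": {
                "metadata": {"type": "object"},
                "test": {"type": "object"},
                "system_under_test": {
                    "properties": {
                        "hardware": {
                            "properties": {
                                # Kept in _source, but subfields are not indexed.
                                "storage": {"type": "object", "enabled": False}
                            }
                        }
                    }
                },
                "test_configuration": {"type": "object"},
                "results": {"type": "object"},
                "runtime_info": {"type": "object"},
            },
        },
    },
}

print(json.dumps(index_template, indent=2))
```

The path-specific template is listed first because dynamic templates are evaluated in order and the first match is applied.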

Acceptance Criteria

  • Code changes complete: Template simplified to match production specification
  • Verified: storage.enabled: false setting in place
  • Verified: Template structure matches production (2 dynamic_templates)
  • Apply simplified template to local OpenSearch (manual testing required)
  • Process sample data from all benchmark types (manual testing required)
  • Verify all documents index successfully (manual testing required)
  • Verify java_version field is keyword (not long) (manual testing required)
  • Run migration script: v1 → v2 (manual testing required)
  • Compare v2 mapping against production (manual testing required)
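
For the last two manual checks, a small helper along these lines could flatten a mapping's `properties` tree and report fields whose types differ. This is a sketch under assumptions: the input is the `properties` dict from a `GET <index>/_mapping` response, and the location of `java_version` under `runtime_info` is hypothetical, chosen only for the example.

```python
def flatten_mapping(properties, prefix=""):
    """Flatten a mapping's `properties` tree into {dotted_path: type}."""
    flat = {}
    for name, spec in properties.items():
        path = prefix + name
        # Object fields may omit "type"; they are treated as "object" here.
        flat[path] = spec.get("type", "object")
        flat.update(flatten_mapping(spec.get("properties", {}), path + "."))
    return flat


def mapping_diff(local, production):
    """Return {path: (local_type, production_type)} for every mismatch."""
    a, b = flatten_mapping(local), flatten_mapping(production)
    return {
        path: (a.get(path), b.get(path))
        for path in sorted(a.keys() | b.keys())
        if a.get(path) != b.get(path)
    }


# Hypothetical sample: java_version inferred as long locally, keyword in production.
local_props = {"runtime_info": {"properties": {"java_version": {"type": "long"}}}}
prod_props = {"runtime_info": {"properties": {"java_version": {"type": "keyword"}}}}

print(mapping_diff(local_props, prod_props))
```

An empty result means the two mappings agree on every field path and type; a field present on only one side shows up with `None` for the other.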

Testing

Code validation completed:

  • JSON syntax validated
  • Template structure verified (2 dynamic_templates)
  • Settings verified (dynamic=true, storage.enabled=false, limit=5000)
  • Static mappings preserved (6 top-level properties)
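
The checks listed above could be automated with a short script such as the one below. It is only a sketch: `check_template` is a hypothetical helper, it assumes the total-fields limit is stored under the dotted key `index.mapping.total_fields.limit`, and it accepts either a legacy layout or a composable layout with a `template` wrapper.

```python
import json

EXPECTED_TEMPLATES = ["test_config_parameters_as_keyword", "strings_as_keyword"]


def check_template(text):
    """Return a list of problems found in the template JSON (empty if none)."""
    tpl = json.loads(text)  # raises json.JSONDecodeError on invalid syntax
    body = tpl.get("template", tpl)  # composable templates nest under "template"
    mappings = body.get("mappings", {})
    settings = body.get("settings", {})
    problems = []

    names = [name for entry in mappings.get("dynamic_templates", []) for name in entry]
    if names != EXPECTED_TEMPLATES:
        problems.append(f"unexpected dynamic_templates: {names}")
    if mappings.get("dynamic") not in (True, "true"):
        problems.append("top-level dynamic is not true")
    if settings.get("index.mapping.total_fields.limit") != 5000:
        problems.append("total_fields.limit is not 5000")

    # Walk down to system_under_test.hardware.storage in the static mappings.
    node = mappings
    for key in ("system_under_test", "hardware", "storage"):
        node = node.get("properties", {}).get(key, {})
    if node.get("enabled") is not False:
        problems.append("storage.enabled is not false")
    return problems


sample = json.dumps({
    "settings": {"index.mapping.total_fields.limit": 5000},
    "mappings": {
        "dynamic": True,
        "dynamic_templates": [
            {"test_config_parameters_as_keyword": {}},
            {"strings_as_keyword": {}},
        ],
        "properties": {"system_under_test": {"properties": {"hardware": {
            "properties": {"storage": {"type": "object", "enabled": False}}}}}},
    },
})

print(check_template(sample))  # prints []
```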

Manual testing required:
The remaining acceptance criteria require manual verification with a running OpenSearch instance and sample benchmark data. The code changes are complete and match the production specification described in the issue.

Impact

Before: Documents failed indexing with mapper_parsing_exception due to object/scalar type conflicts
After: Template matches production, allowing flexible schema with proper string handling

Related

Files Modified

  • src/chronicler/config/opensearch_index_template.json (-75 lines: simplified dynamic_templates)

Generated with Claude Code (Claude Sonnet 4.5)

Removes 7 problematic dynamic_templates that caused mapper_parsing_exception
errors when indexing documents with scalar values at object-mapped paths:
- run_objects (results.runs.*)
- timeseries_objects (results.runs.*.timeseries.*)
- numa_nodes (system_under_test.hardware.numa.*)
- storage_devices (system_under_test.hardware.storage.*)
- network_interfaces (system_under_test.hardware.network.*)
- cpu_flags (system_under_test.hardware.cpu.flags.*)
- validation_threads (results.runs.*.validation.threads)

Adds strings_as_keyword dynamic_template to match production behavior:
- Maps all string fields as keyword type
- Sets ignore_above: 1024 to prevent indexing overly long strings

Template now matches production with only 2 dynamic_templates:
1. test_config_parameters_as_keyword
2. strings_as_keyword

Settings verified:
- dynamic: true (allows flexible schema)
- storage.enabled: false (prevents mapping conflicts)
- total_fields.limit: 5000 (matches production)

Part of #54
coderabbitai Bot commented Jun 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: d9c2ccdf-26b9-44be-88ad-127ec8d095d2

📥 Commits

Reviewing files that changed from the base of the PR and between d4c9790 and 1b995d5.

📒 Files selected for processing (1)
  • src/chronicler/config/opensearch_index_template.json

📝 Walkthrough

Summary by CodeRabbit

  • Chores
    • Updated OpenSearch index template configuration to improve string field handling and refine storage field indexing behavior.

Walkthrough

The OpenSearch index template is updated to match production configuration: the top-level dynamic mapping is changed from false to true, a strings_as_keyword dynamic template is added to map string fields as keyword with ignore_above: 1024, and the system_under_test.hardware.storage object is changed from dynamic: true to enabled: false.

Changes

Index Template Mapping Corrections

| Layer / File(s) | Summary |
| --- | --- |
| Dynamic mapping, keyword strings, and storage object: src/chronicler/config/opensearch_index_template.json | Top-level dynamic changed from false to true; strings_as_keyword dynamic template added, mapping string fields to keyword with ignore_above: 1024; system_under_test.hardware.storage switched from dynamic: true to enabled: false to prevent storage subfield indexing. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly summarizes the main change: simplifying the index template to match production configuration. |
| Description check | ✅ Passed | The description comprehensively explains the problem, changes made, and their impact, all directly related to the template simplification. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #54 objectives: removes 7 problematic dynamic_templates, adds strings_as_keyword, reduces to 2 templates matching production, and verifies all required settings (dynamic:true, storage.enabled:false, limit:5000). |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to issue #54: only the opensearch_index_template.json was modified to simplify dynamic_templates and match production configuration. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


grdumas self-assigned this Jun 17, 2026

grdumas (Collaborator, Author) left a comment

PR Review: #54: Simplify index template to match production

Summary

The PR simplifies the OpenSearch index template by removing complex and problematic dynamic templates, switching to dynamic: true, and aligning the configuration with the production environment. This directly addresses mapper_parsing_exception errors caused by mapping conflicts.

Critical Issues (MUST FIX)

None found.

Security Delta

No security-relevant code was removed. The addition of total_fields.limit: 5000 is a positive security/stability measure to prevent mapping explosion in a dynamic mapping environment.

Major Issues (SHOULD FIX)

None found.

Minor Issues (NICE TO HAVE)

  • Field: results.runs.*.validation.threads
    The removal of the nested mapping for validation_threads means that cross-field correlation queries within the same thread object in the array will no longer work as expected. However, as this aligns with production and fixes parsing exceptions, it is likely the correct trade-off.

  • Field: system_under_test.hardware.storage
    Changing this field to "enabled": false means that data within the storage object will be stored but not indexed. Users will be able to retrieve this data in the _source document but will not be able to search or aggregate on its internal fields. If searching on storage devices is required, this may need to be revisited.
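
The first trade-off can be illustrated with a toy model in Python. It is not OpenSearch code; it only mimics how an array of objects without a nested mapping is flattened into one value list per field, which loses the association between fields of the same object. The thread fields `id` and `status` are hypothetical.

```python
def flatten_object_array(objects):
    """Mimic non-nested indexing of an object array: one value list per field."""
    flat = {}
    for obj in objects:
        for field, value in obj.items():
            flat.setdefault(field, []).append(value)
    return flat


def flat_match(flat, **criteria):
    """Each criterion is checked independently against the flattened values."""
    return all(value in flat.get(field, []) for field, value in criteria.items())


def nested_match(objects, **criteria):
    """Nested semantics: a single object must satisfy every criterion."""
    return any(
        all(obj.get(field) == value for field, value in criteria.items())
        for obj in objects
    )


threads = [
    {"id": 1, "status": "passed"},
    {"id": 2, "status": "failed"},
]
flat = flatten_object_array(threads)

# No single thread has id=1 and status="failed", yet the flattened form matches.
print(flat_match(flat, id=1, status="failed"))       # prints True
print(nested_match(threads, id=1, status="failed"))  # prints False
```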

Nitpicks (OPTIONAL)

None.

Positive Notes

  • Good use of strings_as_keyword with ignore_above: 1024 to ensure consistent mapping of string fields while protecting against overly long values.
  • Aligning local development templates with production is excellent practice to avoid "works on my machine" issues.
  • The PR description is very detailed, clearly explaining the "Before" vs "After" and the rationale for the removals.

Overall Assessment

  • Status: APPROVE
  • Reasoning: The changes are well-justified, follow production standards, and solve a known issue with indexing failures.
  • Next Steps: Merge the PR and verify the fix with local OpenSearch as planned in the manual testing section.

Reviewed by: Gemini Pro via automated code review

grdumas (Collaborator, Author) commented Jun 17, 2026

PR Update: Review Response

What was done

No code changes were made. The review is approved and the noted trade-offs are intentional per the production specification.

Addressing the minor notes

1. validation_threads nested mapping removal

The removal of the nested type for results.runs.*.validation.threads is intentional and matches production. Cross-field correlation queries within thread objects will no longer work, but this trade-off is accepted: per issue #54, production has only 2 dynamic_templates with dynamic: true, allowing OpenSearch to infer types naturally rather than pre-defining nested structures.

2. storage field indexing disabled

Setting system_under_test.hardware.storage to "enabled": false is specified in issue #54 as a required production match:

Production mapping settings:

  • storage: {"type": "object", "enabled": false} (local had enabled: true - now fixed)

This prevents mapping conflicts while preserving data in _source. Storage data remains retrievable but not searchable/aggregatable, which matches production behavior and prevents the parsing exceptions.

Why this approach

Both noted items are not issues to fix but rather intentional design decisions documented in #54 that prioritize:

  1. Eliminating indexing failures (mapper_parsing_exception)
  2. Matching production configuration exactly
  3. Using flexible dynamic mapping over rigid pre-defined structures

Next Steps

PR is ready for manual testing per the acceptance criteria:

  • Apply template to local OpenSearch
  • Process sample benchmark data
  • Verify indexing succeeds across all benchmark types
  • Run v1→v2 migration

Responded by: Claude Sonnet 4.5 via automated workflow

grdumas merged commit 1e1ae48 into main on Jun 17, 2026 (2 checks passed)
grdumas deleted the fix/54-simplify-index-template branch on June 17, 2026 04:16

Development

Successfully merging this pull request may close these issues.

Index template has excessive dynamic_templates causing mapper_parsing_exception
